162 research outputs found

    On the asymptotic optimality of greedy index heuristics for multi-action restless bandits

    The class of restless bandits proposed by Whittle (1988) has long been known to be intractable. This paper presents an optimality result which extends that of Weber and Weiss (1990) for restless bandits to a more general setting in which individual bandits have multiple levels of activation but are subject to an overall resource constraint. The contribution is motivated by the recent works of Glazebrook et al. (2011a), (2011b), who discussed the performance of index heuristics for resource allocation in such systems. Hitherto, index heuristics have been shown, under a condition of full indexability, to be optimal for a natural Lagrangian relaxation of such problems in which a resource is purchased rather than constrained. We find that, under key assumptions about the nature of solutions to a deterministic differential equation, the index heuristics above are asymptotically optimal in a sense described by Whittle. We then demonstrate that these assumptions always hold for three-state bandits.

    Developing effective service policies for multiclass queues with abandonment: asymptotic optimality and approximate policy improvement

    We study a single server queuing model with multiple classes and impatient customers. The goal is to determine a service policy to maximize the long-run reward rate earned from serving customers net of holding costs and penalties respectively due to customers waiting for and leaving before receiving service. We first show that it is without loss of generality to study a pure-reward model. Since standard methods can usually only compute the optimal policy for problems with up to three customer classes, our focus is to develop a suite of heuristic approaches, with a preference for operationally simple policies with good reward characteristics. One such heuristic is the Rμθ rule—a priority policy that ranks all customer classes based on the product of reward R, service rate μ, and abandonment rate θ. We show that the Rμθ rule is asymptotically optimal as customer abandonment rates approach zero and often performs well in cases where the simpler Rμ rule performs poorly. The paper also develops an approximate policy improvement method that uses simulation and interpolation to estimate the bias function for use in a dynamic programming recursion. For systems with two or three customer classes, our numerical study indicates that the best of our simple priority policies is near optimal in most cases; when it is not, the approximate policy improvement method invariably tightens up the gap substantially. For systems with five customer classes, our heuristics typically achieve within 4% of an upper bound for the optimal value, which is computed via a linear program that relies on a relaxation of the original system. The computational requirement of the approximate policy improvement method grows rapidly when the number of customer classes or the traffic intensity increases
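As a rough illustration of the priority rule described in this abstract, the sketch below ranks customer classes by the product of reward R, service rate μ, and abandonment rate θ. It assumes that a larger product means higher priority, which the abstract does not state explicitly; the class names, field names, and numbers are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomerClass:
    """One customer class; all field names are illustrative assumptions."""
    name: str
    reward: float            # R: reward earned per completed service
    service_rate: float      # mu: service completion rate
    abandonment_rate: float  # theta: rate at which waiting customers leave


def r_mu_theta_index(c: CustomerClass) -> float:
    """Product R * mu * theta used to rank a class."""
    return c.reward * c.service_rate * c.abandonment_rate


def priority_order(classes):
    """Return classes sorted from highest to lowest index.

    Assumption: a larger product means the class is served first.
    """
    return sorted(classes, key=r_mu_theta_index, reverse=True)


# Hypothetical example data, chosen only to show the ranking.
example = [
    CustomerClass("A", reward=1.0, service_rate=2.0, abandonment_rate=0.5),
    CustomerClass("B", reward=3.0, service_rate=1.0, abandonment_rate=1.0),
    CustomerClass("C", reward=2.0, service_rate=1.0, abandonment_rate=0.25),
]
ranking = [c.name for c in priority_order(example)]
```

With these made-up values the indices are 1.0 for A, 3.0 for B, and 0.5 for C, so the ranking is B, A, C. A static priority ordering of this kind ignores the current queue lengths, which is what makes it operationally simple.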

    Assessing an intuitive condition for stability under a range of traffic conditions via a generalised Lu-Kumar network

    We argue the importance both of developing simple sufficient conditions for the stability of general multiclass queueing networks and of assessing such conditions under a range of assumptions on the weight of the traffic flowing between service stations. To achieve the former, we review a peak-rate stability condition and extend its range of application; to achieve the latter, we introduce a generalisation of the Lu-Kumar network on which the stability condition may be tested for a range of traffic configurations. The peak-rate condition is close to exact when the between-station traffic is light, but degrades as this traffic increases. Keywords: multiclass queueing networks, stability, fluid model, Lu-Kumar network

    On the identification and mitigation of weaknesses in the Knowledge Gradient policy for multi-armed bandits

    The Knowledge Gradient (KG) policy was originally proposed for online ranking and selection problems but has recently been adapted for use in online decision making in general and multi-armed bandit problems (MABs) in particular. We study its use in a class of exponential family MABs and identify weaknesses, including a propensity to take actions which are dominated with respect to both exploitation and exploration. We propose variants of KG which avoid such errors. These new policies include an index heuristic which deploys a KG approach to develop an approximation to the Gittins index. A numerical study shows this policy to perform well over a range of MABs including those for which index policies are not optimal. While KG does not make dominated actions when bandits are Gaussian, it fails to be index consistent and appears not to enjoy a performance advantage over competitor policies when arms are correlated to compensate for its greater computational demands

    Applications of stochastic modeling in air traffic management: Methods, challenges and opportunities for solving air traffic problems under uncertainty

    In this paper we provide a wide-ranging review of the literature on stochastic modeling applications within aviation, with a particular focus on problems involving demand and capacity management and the mitigation of air traffic congestion. From an operations research perspective, the main techniques of interest include analytical queueing theory, stochastic optimal control, robust optimization and stochastic integer programming. Applications of these techniques include the prediction of operational delays at airports, pre-tactical control of aircraft departure times, dynamic control and allocation of scarce airport resources and various others. We provide a critical review of recent developments in the literature and identify promising research opportunities for stochastic modelers within air traffic management

    The Mass Assembly Histories of Galaxies of Various Morphologies in the GOODS Fields

    We present an analysis of the growth of stellar mass with cosmic time partitioned according to galaxy morphology. Using a well-defined catalog of 2150 galaxies based, in part, on archival data in the GOODS fields, we assign morphological types in three broad classes (Ellipticals, Spirals, Peculiar/Irregulars) to a limit of z_AB=22.5 and make the resulting catalog publicly available. We combine redshift information, optical photometry from the GOODS catalog and deep K-band imaging to assign stellar masses. We find little evolution in the form of the galaxy stellar mass function from z~1 to z=0, especially at the high mass end where our results are most robust. Although the population of massive galaxies is relatively well established at z~1, its morphological mix continues to change, with an increasing proportion of early-type galaxies at later times. By constructing type-dependent stellar mass functions, we show that in each of three redshift intervals, E/S0's dominate the higher mass population, while spirals are favored at lower masses. This transition occurs at a stellar mass of 2-3 × 10^10 Msun at z~0.3 (similar to local studies) but there is evidence that the relevant mass scale moves to higher mass at earlier epochs. Such evolution may represent the morphological extension of the "downsizing" phenomenon, in which the most massive galaxies stop forming stars first, with lower mass galaxies becoming quiescent later. We infer that more massive galaxies evolve into spheroidal systems at earlier times, and that this morphological transformation may only be completed 1-2 Gyr after the galaxies emerge from their active star forming phase. We discuss several lines of evidence suggesting that merging may play a key role in generating this pattern of evolution. Comment: 24 pages, 1 table, 8 figures, accepted for publication in Ap

    The achievable region approach to the optimal control of stochastic systems

    The achievable region approach seeks solutions to stochastic optimisation problems by: (i) characterising the space of all possible performances (the achievable region) of the system of interest, and (ii) optimising the overall system-wide performance objective over this space. This is radically different from conventional formulations based on dynamic programming. The approach is explained with reference to a simple two-class queueing system. Powerful new methodologies due to the authors and co-workers are deployed to analyse a general multiclass queueing system with parallel servers and then to develop an approach to optimal load distribution across a network of interconnected stations. Finally, the approach is used for the first time to analyse a class of intensity control problems. Keywords: achievable region, Gittins index, linear programming, load balancing, multi-class queueing systems, performance space, stochastic optimisation, threshold policy

    Modeling and analysis of uncertain time-critical tasking problems

    Naval Research Logistics, 53, No. 6 (Sept. 2006), 588-599. This paper describes modeling and operational analysis of a generic asymmetric services-system situation in which (a) Red agents, potentially threatening but, in another important interpretation, isolated friendlies such as downed pilots who require assistance, "arrive" according to some partially known and potentially changing pattern in time and space; and (b) Reds have effectively limited, unknown deadlines or times of availability for Blue service, i.e., detection, classification, and attack in a military setting or emergency assistance in others. We discuss various service options for Blue service agents and devise several approximations that allow one to compute efficiently the proportions of tasks of different classes that are successfully serviced or, more generally, if different rewards are associated with different classes of tasks, the percentage of the possible reward gained. We suggest heuristic policies by which a Blue server selects the next task to perform and decides how much time to allocate to that service. We discuss this for a number of specific examples.

    A Classical Search Game In Discrete Locations

    Consider a two-person zero-sum search game between a hider and a searcher. The hider hides among n discrete locations, and the searcher successively visits individual locations until finding the hider. Known to both players, a search at location i takes t_i time units and detects the hider (if hidden there) independently with probability α_i, for i = 1,...,n. The hider aims to maximize the expected time until detection, while the searcher aims to minimize it. We prove the existence of an optimal strategy for each player. In particular, any optimal mixed hiding strategy hides in each location with a nonzero probability, and there exists an optimal mixed search strategy which can be constructed with up to n simple search sequences.